Stratification Trees for Adaptive Randomization in Randomized Controlled Trials
This paper proposes an adaptive randomization procedure for two-stage
randomized controlled trials. The method uses data from a first-wave experiment
in order to determine how to stratify in a second wave of the experiment, where
the objective is to minimize the variance of an estimator for the average
treatment effect (ATE). We consider selection from a class of stratified
randomization procedures which we call stratification trees: these are
procedures whose strata can be represented as decision trees, with differing
treatment assignment probabilities across strata. By using the first wave to
estimate a stratification tree, we simultaneously select which covariates to
use for stratification, how to stratify over these covariates, as well as the
assignment probabilities within these strata. Our main result shows that using
this randomization procedure with an appropriate estimator results in an
asymptotic variance which is minimal in the class of stratification trees.
Moreover, the results we present are able to accommodate a large class of
assignment mechanisms within strata, including stratified block randomization.
In a simulation study, we find that our method, paired with an appropriate
cross-validation procedure, can improve on ad-hoc choices of stratification. We
conclude by applying our method to the study in Karlan and Wood (2017), where
we estimate stratification trees using the first wave of their experiment.
Inference for Matched Tuples and Fully Blocked Factorial Designs
This paper studies inference in randomized controlled trials with multiple
treatments, where treatment status is determined according to a "matched
tuples" design. Here, by a matched tuples design, we mean an experimental
design where units are sampled i.i.d. from the population of interest, grouped
into "homogeneous" blocks with cardinality equal to the number of treatments,
and finally, within each block, each treatment is assigned exactly once
uniformly at random. We first study estimation and inference for matched tuples
designs in the general setting where the parameter of interest is a vector of
linear contrasts over the collection of average potential outcomes for each
treatment. Parameters of this form include standard average treatment effects
used to compare one treatment relative to another, but also include parameters
which may be of interest in the analysis of factorial designs. We first
establish conditions under which a sample analogue estimator is asymptotically
normal and construct a consistent estimator of its corresponding asymptotic
variance. Combining these results establishes the asymptotic exactness of tests
based on these estimators. In contrast, we show that, for two common testing
procedures based on t-tests constructed from linear regressions, one test is
generally conservative while the other generally invalid. We go on to apply our
results to study the asymptotic properties of what we call "fully-blocked" 2^K
factorial designs, which are simply matched tuples designs applied to a full
factorial experiment. Leveraging our previous results, we establish that our
estimator achieves a lower asymptotic variance under the fully-blocked design
than that under any stratified factorial design which stratifies the
experimental sample into a finite number of "large" strata. A simulation study
and empirical application illustrate the practical relevance of our results.
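As a rough illustration of the assignment step in a matched tuples design, the sketch below groups units into blocks whose size equals the number of treatments and assigns each treatment exactly once per block, uniformly at random. Forming blocks by sorting on a single scalar covariate is an assumption made here for illustration only; the abstract does not say how the "homogeneous" blocks are built, and the function name `matched_tuples_assignment` is hypothetical.

```python
import random


def matched_tuples_assignment(covariate, num_treatments, seed=0):
    """Return (assignment, blocks) for a toy matched tuples design.

    Assumption: blocks are formed by sorting units on one scalar
    covariate and cutting the sorted order into consecutive groups.
    """
    n = len(covariate)
    if n % num_treatments != 0:
        raise ValueError("sample size must be a multiple of num_treatments")
    rng = random.Random(seed)
    # Indices of units ordered by covariate value.
    order = sorted(range(n), key=lambda i: covariate[i])
    assignment = [None] * n
    blocks = []
    for start in range(0, n, num_treatments):
        block = order[start:start + num_treatments]
        labels = list(range(num_treatments))
        rng.shuffle(labels)  # uniformly random permutation of treatments
        for unit, label in zip(block, labels):
            assignment[unit] = label
        blocks.append(block)
    return assignment, blocks


x = [0.3, 2.1, 0.1, 1.9, 1.0, 1.2, 5.0, 4.0, 4.5]
assignment, blocks = matched_tuples_assignment(x, num_treatments=3)
```

With three treatments and nine units, every block contains one unit per treatment, so the design is balanced within each block by construction.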
Inference for Cluster Randomized Experiments with Non-ignorable Cluster Sizes
This paper considers the problem of inference in cluster randomized
experiments when cluster sizes are non-ignorable. Here, by a cluster randomized
experiment, we mean one in which treatment is assigned at the level of the
cluster; by non-ignorable cluster sizes we mean that the distribution of
potential outcomes, and the treatment effects in particular, may depend
non-trivially on the cluster sizes. In order to permit this sort of
flexibility, we consider a sampling framework in which cluster sizes themselves
are random. In this way, our analysis departs from earlier analyses of cluster
randomized experiments in which cluster sizes are treated as non-random. We
distinguish between two different parameters of interest: the equally-weighted
cluster-level average treatment effect, and the size-weighted cluster-level
average treatment effect. For each parameter, we provide methods for inference
in an asymptotic framework where the number of clusters tends to infinity and
treatment is assigned using a covariate-adaptive stratified randomization
procedure. We additionally permit the experimenter to sample only a subset of
the units within each cluster rather than the entire cluster and demonstrate
the implications of such sampling for some commonly used estimators. A small
simulation study and empirical demonstration show the practical relevance of
our theoretical results.
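The difference between the two parameters can be seen in a toy calculation. The sketch below assumes, purely for illustration, that the unit-level treatment effects in each cluster are known; it computes finite-sample analogues of the two averages and is not the inference procedure proposed in the paper. The function names are hypothetical.

```python
def equally_weighted_effect(cluster_effects):
    """Average of within-cluster mean effects; every cluster counts once."""
    cluster_means = [sum(e) / len(e) for e in cluster_effects]
    return sum(cluster_means) / len(cluster_means)


def size_weighted_effect(cluster_effects):
    """Average effect over all units; larger clusters get more weight."""
    total_effect = sum(sum(e) for e in cluster_effects)
    total_units = sum(len(e) for e in cluster_effects)
    return total_effect / total_units


# Hypothetical data: a small cluster with effect 1 per unit and a
# large cluster with effect 3 per unit.
effects = [[1.0, 1.0], [3.0] * 8]
equal = equally_weighted_effect(effects)
weighted = size_weighted_effect(effects)
```

In this example the two averages differ because the treatment effect varies with cluster size, which is the situation the abstract calls non-ignorable cluster sizes.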
Inference in Experiments with Matched Pairs and Imperfect Compliance
This paper studies inference for the local average treatment effect in
randomized controlled trials with imperfect compliance where treatment status
is determined according to "matched pairs." By "matched pairs," we mean that
units are sampled i.i.d. from the population of interest, paired according to
observed, baseline covariates and finally, within each pair, one unit is
selected at random for treatment. Under weak assumptions governing the quality
of the pairings, we first derive the limiting behavior of the usual Wald (i.e.,
two-stage least squares) estimator of the local average treatment effect. We
show further that the conventional heteroskedasticity-robust estimator of its
limiting variance is generally conservative in that its limit in probability is
(typically strictly) larger than the limiting variance. We therefore provide an
alternative estimator of the limiting variance that is consistent for the
desired quantity. Finally, we consider the use of additional observed, baseline
covariates not used in pairing units to increase the precision with which we
can estimate the local average treatment effect. To this end, we derive the
limiting behavior of a two-stage least squares estimator of the local average
treatment effect which includes both the additional covariates and
pair fixed effects, and show that the limiting variance is always less than or
equal to that of the Wald estimator. To complete our analysis, we provide a
consistent estimator of this limiting variance. A simulation study confirms the
practical relevance of our theoretical results. We use our results to revisit a
prominent experiment studying the effect of macroinsurance on microenterprise
in Egypt.
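For readers unfamiliar with the Wald estimator mentioned above, a minimal sketch follows. It computes the ratio of the difference in mean outcomes to the difference in mean treatment take-up between units assigned to treatment and units assigned to control. The variable names (`z` for assignment, `d` for take-up, `y` for outcome) and the data are hypothetical, and the sketch does not implement the variance estimators discussed in the abstract.

```python
def wald_estimate(z, d, y):
    """Wald estimate of the local average treatment effect.

    z: 1 if assigned to treatment, 0 otherwise
    d: 1 if treatment was actually taken up, 0 otherwise
    y: observed outcome
    """
    def mean(values):
        return sum(values) / len(values)

    y1 = mean([yi for zi, yi in zip(z, y) if zi == 1])
    y0 = mean([yi for zi, yi in zip(z, y) if zi == 0])
    d1 = mean([di for zi, di in zip(z, d) if zi == 1])
    d0 = mean([di for zi, di in zip(z, d) if zi == 0])
    if d1 == d0:
        raise ValueError("assignment does not shift take-up")
    # Intention-to-treat effect on y scaled by the effect on take-up.
    return (y1 - y0) / (d1 - d0)


# Hypothetical data with one-sided noncompliance: half of the assigned
# units take up treatment, no control unit does.
z = [1, 1, 1, 1, 0, 0, 0, 0]
d = [1, 1, 0, 0, 0, 0, 0, 0]
y = [5.0, 7.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0]
late = wald_estimate(z, d, y)
```

Here the intention-to-treat effect on the outcome is 2 and assignment raises take-up by one half, so the ratio is 4.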
On the Efficiency of Finely Stratified Experiments
This paper studies the efficient estimation of a large class of treatment
effect parameters that arise in the analysis of experiments. Here, efficiency
is understood to be with respect to a broad class of treatment assignment
schemes for which the marginal probability that any unit is assigned to
treatment equals a pre-specified value, e.g., one half. Importantly, we do not
require that treatment status is assigned in an i.i.d. fashion, thereby
accommodating complicated treatment assignment schemes that are used in
practice, such as stratified block randomization and matched pairs. The class
of parameters considered are those that can be expressed as the solution to a
restriction on the expectation of a known function of the observed data,
including possibly the pre-specified value for the marginal probability of
treatment assignment. We show that this class of parameters includes, among
other things, average treatment effects, quantile treatment effects, local
average treatment effects as well as the counterparts to these quantities in
experiments in which the unit is itself a cluster. In this setting, we
establish two results. First, we derive a lower bound on the asymptotic
variance of estimators of the parameter of interest in the form of a
convolution theorem. Second, we show that the naïve method of moments
estimator achieves this bound on the asymptotic variance quite generally if
treatment is assigned using a "finely stratified" design. By a "finely
stratified" design, we mean experiments in which units are divided into groups
of a fixed size and a proportion within each group is assigned to treatment
uniformly at random so that it respects the restriction on the marginal
probability of treatment assignment. In this sense, "finely stratified"
experiments lead to efficient estimators of treatment effect parameters "by
design" rather than through ex post covariate adjustment.
Inference in Cluster Randomized Trials with Matched Pairs
This paper considers the problem of inference in cluster randomized trials
where treatment status is determined according to a "matched pairs" design.
Here, by a cluster randomized experiment, we mean one in which treatment is
assigned at the level of the cluster; by a "matched pairs" design we mean that
a sample of clusters is paired according to baseline, cluster-level covariates
and, within each pair, one cluster is selected at random for treatment. We
study the large sample behavior of a weighted difference-in-means estimator and
derive two distinct sets of results depending on whether the matching procedure does
or does not match on cluster size. We then propose a variance estimator which
is consistent in either case. We also study the behavior of a randomization
test which permutes the treatment status for clusters within pairs, and
establish its finite sample and asymptotic validity for testing specific null
hypotheses.
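A randomization test that permutes treatment status within pairs can be sketched as follows. Swapping the treatment labels within a pair flips the sign of that pair's treated-minus-control difference, so the sketch enumerates all sign patterns exactly. The test statistic used here (the absolute mean of pair differences, computed on hypothetical cluster-level outcomes) is an assumption for illustration; the abstract does not specify the statistic used in the paper.

```python
from itertools import product


def paired_randomization_pvalue(treated, control):
    """Exact p-value from all within-pair swaps of treatment status.

    treated, control: cluster-level outcomes, one entry per pair.
    """
    diffs = [t - c for t, c in zip(treated, control)]
    m = len(diffs)
    observed = abs(sum(diffs)) / m
    at_least_as_large = 0
    total = 0
    # Each sign pattern corresponds to one way of swapping labels
    # within pairs; there are 2**m such patterns.
    for signs in product((1, -1), repeat=m):
        stat = abs(sum(s * d for s, d in zip(signs, diffs))) / m
        if stat >= observed - 1e-12:  # tolerance for float comparison
            at_least_as_large += 1
        total += 1
    return at_least_as_large / total


# Hypothetical outcomes for three pairs of clusters.
p_value = paired_randomization_pvalue([3.0, 4.0, 5.0], [1.0, 1.0, 1.0])
```

Full enumeration is only practical for a small number of pairs; with many pairs one would typically draw sign patterns at random instead.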
Revisiting the Analysis of Matched-Pair and Stratified Experiments in the Presence of Attrition
In this paper we revisit some common recommendations regarding the analysis
of matched-pair and stratified experimental designs in the presence of
attrition. Our main objective is to clarify a number of well-known claims about
the practice of dropping pairs with an attrited unit when analyzing
matched-pair designs. Contradictory advice appears in the literature about
whether or not dropping pairs is beneficial or harmful, and stratifying into
larger groups has been recommended as a resolution to the issue. To address
these claims, we derive the estimands obtained from the difference-in-means
estimator in a matched-pair design both when the observations from pairs with
an attrited unit are retained and when they are dropped. We find limited
evidence to support the claims that dropping pairs helps recover the average
treatment effect, but we find that it may potentially help in recovering a
convex weighted average of conditional average treatment effects. We report
similar findings for stratified designs when studying the estimands obtained
from a regression of outcomes on treatment with and without strata fixed
effects.
Inference With Dyadic Data: Asymptotic Behavior of the Dyadic-Robust t-Statistic
This article is concerned with inference in the linear model with dyadic data. Dyadic data are indexed by pairs of “units;” for example, trade data between pairs of countries. Because of the potential for observations with a unit in common to be correlated, standard inference procedures may not perform as expected. We establish a range of conditions under which a t-statistic with the dyadic-robust variance estimator of Fafchamps and Gubert is asymptotically normal. Using our theoretical results as a guide, we perform a simulation exercise to study the validity of the normal approximation, as well as the performance of a novel finite-sample correction. We conclude with guidelines for applied researchers wishing to use the dyadic-robust estimator for inference.